Some details of my AGI design
By Jonathan Standley
Figure 1: The system level architecture of the AGI design (click for full size image)
Figure 2: System components composed of Concept Space Networks are highlighted in white (click for full size image)
The concept most central to the design is the Concept Space Network, or CSN. Figure 2 (above) highlights the CSN components of the system. At its most basic level, a CSN is nothing but a collection of network nodes. The number of nodes is arbitrary, and nodes can be added as the system evolves. These nodes can be mapped to a 2-dimensional plane, a property that is in fact crucial to much of the system’s functionality.
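The structure just described can be sketched as follows. This is a minimal illustration only, assuming (hypothetically) that each node carries a 2-D coordinate and an activation level; the class and method names are mine, not part of the original design.

```python
import random

class Node:
    def __init__(self, x, y):
        self.x = x          # position on the 2-D plane
        self.y = y
        self.activation = 0.0

class CSN:
    def __init__(self):
        self.nodes = []     # node count is arbitrary and can grow

    def add_node(self, x=None, y=None):
        # Nodes may be placed explicitly or at a random plane position.
        if x is None:
            x, y = random.random(), random.random()
        node = Node(x, y)
        self.nodes.append(node)
        return node

csn = CSN()
for _ in range(5):
    csn.add_node()
print(len(csn.nodes))  # 5
```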
A critical relationship in this design is that between the sensory system and the various CSNs. Features of the environment perceived by the sensory system activate specific clusters of nodes in the CSNs. Which particular nodes correspond to a given environmental feature is not important; what matters is that a given feature always activates the same node cluster(s). As a result, everything the system knows is “symbol grounded”, i.e. at least partially derived from sensory perception.
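The key requirement here is stability of the feature-to-cluster mapping. One way to sketch this, purely as an illustration, is to derive the cluster deterministically from the feature itself; the hashing scheme and cluster size below are my assumptions, not the design's mechanism.

```python
import hashlib

def feature_cluster(feature: str, n_nodes: int, cluster_size: int = 3):
    """Map a perceived feature to a fixed cluster of node indices.

    Because the hash is deterministic, the same feature always
    grounds to the same node cluster.
    """
    digest = hashlib.sha256(feature.encode()).digest()
    return sorted({digest[i] % n_nodes for i in range(cluster_size)})

# The same environmental feature always activates the same nodes:
a = feature_cluster("red ball", 100)
b = feature_cluster("red ball", 100)
assert a == b
```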
The system has been designed to work much like a human mind, and, like a newborn, an AI based on my design will not ‘know’ anything at initialization.
Section 1.1 The Foundation of CSN-based Cognition
The problem of cognition has always been the stumbling block in AI research. Some notable efforts have attempted, and at least partially succeeded, to solve this problem through brute-force methods. The CYC project is an example: all the knowledge in the system’s Knowledge Base (KB) was entered by hand(!). CYC has some impressive abilities, especially in the areas of logical reasoning and algorithmic inference. In no way do I question the validity of the many approaches to cognition that exist today in the artificial intelligence field. However, most likely due to the highly multidisciplinary nature of my formal and self-directed education, I have chosen an approach quite different from the current ‘mainstream’ in AI research.
I have thus developed a unique (as far as I know) model of cognition that unites cognition and knowledge, and provides simple methods for generating high-level thought processes such as creative problem solving and ‘intuitive’ inference and deduction. I realize this is a grand claim, but if you read on you will see why I make such a bold statement.
Figure 3: Details of a core CSN concept (click picture to enlarge)
Figure 4: A detail of the steps involved in a simple CSN cognitive operation, in this instance the creation of a Functional Link
A critical feature of the CSN-based cognition model is the adjustable “opacity” of the various CSN-mapped planes. This allows any of the analytical, executive, goal-finding, etc. processes to ‘view’ any CSN(s) ‘through’ any other CSN(s). Once the appropriate CSNs have been chosen for ‘viewing’, arbitrary analysis can be conducted. Figures 3 & 4 illustrate this idea visually. Adobe Photoshop’s filters and layer system provide a good analogy for this aspect of CSN-based cognition.
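Following the Photoshop analogy, the ‘opacity’ idea can be sketched as alpha-blending two activation maps, so an analytical process sees one CSN plane through another. The grid representation and the linear blend rule are my illustrative assumptions.

```python
def view_through(upper, lower, opacity):
    """Blend two same-sized activation grids.

    `opacity` weights the upper CSN plane; the remainder shows
    through from the lower plane, like a Photoshop layer.
    """
    return [[opacity * u + (1 - opacity) * l
             for u, l in zip(urow, lrow)]
            for urow, lrow in zip(upper, lower)]

csn_a = [[1.0, 0.0], [0.5, 0.5]]
csn_b = [[0.0, 1.0], [0.5, 0.5]]
print(view_through(csn_a, csn_b, 0.25))  # [[0.25, 0.75], [0.5, 0.5]]
```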
The Functional Link is perhaps the most basic of CSN node interconnects. In its simplest form, a functional link is simply an interconnection through which nodal activation propagates. These links can run between individual nodes, individual node clusters, or indeed any level of network organization. Beyond this simplest form, functional links can carry out many simple functions, hence their name. Many of these functions are analogous to actions performed by human brain cells; inhibition and excitation are two very important functional link effects. Because spreading node activation is a key feature of CSN-based cognition, functional links, despite their limited functionality, are crucial system components. Note that in the relevant diagrams, arrows indicate the primary direction of activation along functional links.
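A minimal sketch of a functional link, under my own assumptions: a directed connection whose weight is positive (excitatory) or negative (inhibitory), with an optional bi-directional flag. The field names and the simple additive propagation rule are illustrative, not taken from the design.

```python
class FunctionalLink:
    def __init__(self, source, target, weight=1.0, bidirectional=False):
        self.source = source        # node id (or cluster id)
        self.target = target
        self.weight = weight        # > 0 excites, < 0 inhibits
        self.bidirectional = bidirectional

def propagate(activations, links):
    """One step of spreading activation along functional links."""
    new = dict(activations)
    for link in links:
        new[link.target] = new.get(link.target, 0.0) + \
            activations.get(link.source, 0.0) * link.weight
        if link.bidirectional:
            new[link.source] = new.get(link.source, 0.0) + \
                activations.get(link.target, 0.0) * link.weight
    return new

acts = {"a": 1.0, "b": 0.0}
links = [FunctionalLink("a", "b", weight=0.5)]
print(propagate(acts, links))  # {'a': 1.0, 'b': 0.5}
```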
Procedural Links (PLs) are the means by which a CSN is gradually transformed from a knowledge base into a powerful integrated cognition and memory unit. Procedural links are small programs, constructed in a simple but Turing-complete language. The links connect network nodes, and the program associated with a given PL tells the destination node what to do when it is activated by that link. As with Functional Links, node activation spreads via these links, and PLs can likewise be uni- or bi-directional. In addition, PL programs can contain links themselves, either to CSN structures or to the ‘executive’, non-CSN-based processes. Program links to executive processes are in many ways analogous to function calls in a programming language. At initialization, the creation of PLs is solely the responsibility of the executive processes. As the new mind develops, this changes, with CSN-based cognitive structures creating a growing percentage of new PLs.
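A sketch of the Procedural Link idea, with the caveat that the real design specifies a small Turing-complete language for the link programs; here the program is simply a Python callable, and the executive call-out is a stand-in function. All names are illustrative.

```python
class ProceduralLink:
    def __init__(self, source, target, program):
        self.source = source
        self.target = target
        self.program = program   # runs on the target node upon activation

def executive_log(msg):
    # Stand-in for a non-CSN 'executive' process the PL may invoke,
    # analogous to a function call in a programming language.
    return f"executive handled: {msg}"

def boost_and_report(node_state):
    # The PL program tells the destination node what to do (boost its
    # activation) and then 'calls out' to an executive process.
    node_state["activation"] += 1.0
    return executive_log(f"node boosted to {node_state['activation']}")

pl = ProceduralLink("n1", "n2", boost_and_report)
target = {"activation": 0.0}
print(pl.program(target))  # executive handled: node boosted to 1.0
```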
Relational Links (RLs) specify relationships between network nodes, and during early development they are critical to the creation and organization of the AI’s first knowledge base, the foundation upon which all further factual knowledge is built. Data such as “x is a part of y” or “z is equivalent to b” are examples of the types of relationships RLs denote. It is important to note that the variables in these examples can represent anything, concrete or abstract. Figure 5 below gives some examples of the use of Relational Links.
Figure 5: Examples of RL usage and types (click to enlarge)
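Relational Links can be sketched as typed edges between concepts, as in the “x is a part of y” examples above. The relation vocabulary and query helper below are my illustrative assumptions.

```python
class RelationalLink:
    def __init__(self, subject, relation, obj):
        self.subject = subject
        self.relation = relation    # e.g. "part-of", "equivalent-to"
        self.obj = obj

# A tiny early knowledge base built from RLs:
kb = [
    RelationalLink("wheel", "part-of", "car"),
    RelationalLink("automobile", "equivalent-to", "car"),
]

def parts_of(kb, whole):
    """Query the RL knowledge base for parts of a given whole."""
    return [rl.subject for rl in kb
            if rl.relation == "part-of" and rl.obj == whole]

print(parts_of(kb, "car"))  # ['wheel']
```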
The Directive Components are those system components that act in an explicitly purposeful manner upon initialization. Figure 6, below, shows the Directive Components (they are the red components in the grayed out area).
Figure 6: The Directive Components (click to enlarge). Note that this is the same image as Figure 2.
Directive Components (DCs) serve many important purposes in the system. Initially, no DCs are CSN-based; all are defined and created by the system designer. Upon initialization, the DCs provide basic motivational functionality, developmental vectoring, CSN modification, and indeed all ‘executive’-level tasks.
Upon initialization, DCs motivate the ‘newborn’ AI much the way a human baby is motivated by certain instincts; human babies have an innate ability to recognize faces, among many other instinctual behaviors and abilities. One major DC-driven motivation will be curiosity. Curiosity, in my opinion, is crucial to the ascent to intelligence and self-awareness. It is worth noting that studies of various animal species show a strong correlation between curiosity and intelligence.
The majority of motivators will be based on what is known about childhood development. Behaviors such as imitation will be explicitly driven by DC systems until enough relevant CSN-based functionality exists, at which point such behaviors become integrated with the AI’s ‘self’. At that point, the involved DC(s) will turn control over to CSN cognitive functions and take up a passive monitoring role, periodically evaluating the newly empowered CSN structures on multiple criteria, such as efficiency and adherence to intended functionality. If significant discrepancies arise between desired and actual performance, the monitoring DC(s) may intervene.
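The drive/hand-over/monitor life cycle of a DC can be sketched as a small state machine. The maturity test, thresholds, and state names are my assumptions for illustration, not part of the design.

```python
class DirectiveComponent:
    def __init__(self, behavior, maturity_threshold=0.8, tolerance=0.2):
        self.behavior = behavior
        self.maturity_threshold = maturity_threshold
        self.tolerance = tolerance
        self.mode = "driving"       # explicitly drives the behavior at first

    def step(self, csn_competence, csn_performance, target_performance):
        # Hand over once CSN-based functionality is mature enough.
        if self.mode == "driving" and csn_competence >= self.maturity_threshold:
            self.mode = "monitoring"
        # While monitoring, intervene if performance drifts from target.
        if self.mode == "monitoring":
            if abs(target_performance - csn_performance) > self.tolerance:
                self.mode = "intervening"
        return self.mode

dc = DirectiveComponent("imitation")
print(dc.step(0.5, 0.4, 0.9))   # driving
print(dc.step(0.9, 0.85, 0.9))  # monitoring
print(dc.step(0.9, 0.3, 0.9))   # intervening
```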
DC motivation is not solely in the realm of behavior; emotional aspects of development will also be affected. For example, the AI will have emotional states introduced via DC intervention. The desire for attention, bonding with caregivers, and the need for ‘love’ are examples of this.
Developmental Vectoring is similar in many respects to DC-driven motivation, but it is more specific and goal-oriented, and its goals are usually long-term in nature. While motivators encourage general behaviors such as curiosity, Developmental Vectors (DVs) are responsible for, among other things, language acquisition and ensuring that social development stays within well-defined parameters. DV components will have to be much more complex than the DCs that handle motivational functions. It is conceivable that DVs will have built into their architecture the capacity for easy, direct intervention by human programmers. Of course the goal is to have the new mind develop ‘naturally’, without meddling directly with the mind’s innards. But imagine how useful it would be if one could identify the seeds of mental illness in a 1-year-old child and “reach in” at the neural level to fix the potential problem. This is certainly possible, and will probably be available within 20 years. With an AI, however, there is no need for nanobots or advanced neurosurgical techniques; the nature of the artificial mind itself is conducive to direct intervention.
In the AI, Executive Functions (EFs) constitute the phenomena we refer to as “conscious thinking”. In this design, these functions are handled entirely by DCs at initialization. I should point out that at initialization, executive functioning will be nearly non-existent, as it is in a newborn. Most of the built-in executive DCs will be dormant at first; they will be activated as the mind attains specified developmental milestones. Here is an example of what this entails: there will be a built-in function that carries out conscious prediction of the effects of ‘physics’, in other words, a module that explicitly handles reasoning such as “since the ball is on a hill, it will roll downhill”. Without a knowledge base about the nature of the physical world, this module is useless. Since the entirety of the Knowledge Base (KB) is dependent upon perception of the environment, the module will remain inactive until the mind has built up a rudimentary model of how the world works. This understanding could instead be built into the executive function itself, but that approach has disadvantages. Because the function acts on concepts outside of itself, it will not unduly force the mind to think a certain way: the function handles the logic that governs the motion of a ball, but the mind’s idea and mental image of the concept “ball” can be any variation on the general idea of a ball.
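The milestone-gated activation described above can be sketched simply: each built-in module stays dormant until the developing mind reaches its required milestone. The milestone string and module name below are illustrative examples drawn from the text.

```python
class ExecutiveFunction:
    def __init__(self, name, required_milestone):
        self.name = name
        self.required_milestone = required_milestone
        self.active = False          # dormant at initialization

    def check_activation(self, achieved_milestones):
        # The module wakes only once the mind attains its milestone.
        if self.required_milestone in achieved_milestones:
            self.active = True
        return self.active

physics_ef = ExecutiveFunction("physics-prediction", "rudimentary world model")
print(physics_ef.check_activation(set()))                        # False
print(physics_ef.check_activation({"rudimentary world model"}))  # True
```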
Like all other DCs, executive functions are designed to be slowly ‘replaced’ by CSN cognitive structures that take over the function’s tasks. This is why procedural links are so important: cognition need not be the domain of CSNs, but such an architecture would be significantly less adaptable than one in which all mental activity and functionality is CSN-based.
As stated in the first paragraph of sec. 2.3, executive functions handle more than just specific conscious tasks. A cluster of Executive Function DCs will perform general conscious cognition. They will differ markedly from task-specific EFs in that they will be highly generalized, “fuzzy” processes that rely almost entirely upon non-algorithmic methods. Linked closely with the AI’s current mental state and its self-model, these functions will not operate in a deterministic manner. Most of the ‘building blocks’ of the general-cognition EFs will be heuristic processes, each of which is amenable to modification. These processes will operate stochastically in their handling of input; the input itself will be ‘fuzzy’ and incorporate a certain degree of uncertainty and randomness (but not too much). The higher levels of the general cognitive EFs will mirror the nature of the basic elements, and will include heuristics that work with and upon heuristics. [See Douglas Lenat’s Eurisko program for an example of what I’m getting at.] The AI’s goals, values, and beliefs, influenced by its current emotional and cognitive state and the current environment, will all work together in guiding the AI’s behavior and thoughts toward certain probabilities. These guided thought patterns correspond to human consciousness and conscious thought.
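The “guiding toward certain probabilities” idea can be sketched as weighted stochastic selection: the AI’s current state biases which heuristic is chosen without dictating the outcome. The heuristic names and weighting scheme are assumptions for illustration.

```python
import random

def choose_heuristic(heuristics, state_bias):
    """Pick a heuristic with probability proportional to its biased weight.

    `state_bias` maps heuristic names to weights; anything unlisted
    defaults to 1.0, so the state nudges but never forces the choice.
    """
    weights = [max(state_bias.get(h, 1.0), 0.0) for h in heuristics]
    return random.choices(heuristics, weights=weights, k=1)[0]

heuristics = ["analogy", "decomposition", "trial-and-error"]
# A 'curious' state makes exploration more likely, but not certain:
bias = {"trial-and-error": 3.0}
picked = choose_heuristic(heuristics, bias)
print(picked in heuristics)  # True
```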
All cognition feeds back into itself; this self-reference is part of where consciousness arises in systems based upon my design. [See Douglas Hofstadter’s G.E.B. for more on this and related theories of consciousness.]
There are a number of DC’s that do not neatly fit into the categories discussed above.